SwePub

Result list for search "db:Swepub ;pers:(Lu Zhonghai);pers:(Du G.)"

  • Results 1-5 of 5
1.
  • Du, G., et al. (author)
  • An analytical model for worst-case reorder buffer size of multi-path minimal routing NoCs
  • 2014
  • In: Proceedings - 2014 8th IEEE/ACM International Symposium on Networks-on-Chip, NoCS 2014. IEEE. ISBN 9781479953479, pp. 49-56
  • Conference paper (peer-reviewed). Abstract:
    • Reorder buffers are often needed in multi-path routing networks-on-chip (NoCs) to guarantee in-order packet delivery. However, the buffer sizes are usually over-dimensioned due to a lack of worst-case analysis, leading to unnecessarily large area overhead. Based on network calculus, we propose an analysis framework for the worst-case reorder buffer size in multi-path minimal routing NoCs. Experiments with synthetic traffic and an industry case show that our method can effectively explore the traffic splitting space, as well as the mapping effects in terms of reorder buffer size, with a maximum improvement of 36.50%.
2.
  • Du, G., et al. (author)
  • NR-MPA : Non-recovery compression based multi-path packet-connected-circuit architecture of convolution neural networks accelerator
  • 2019
  • In: Proceedings - 2019 IEEE International Conference on Computer Design, ICCD 2019. Institute of Electrical and Electronics Engineers (IEEE). ISBN 9781538666487, pp. 173-176
  • Conference paper (peer-reviewed). Abstract:
    • Convolution Neural Networks (CNNs) involve massive amounts of data to be calculated and stored. To meet these challenges, parallel hardware accelerators have been proposed that consist of hundreds of Processing Elements (PEs) arranged as a many-core system-on-chip and connected by a Network-on-Chip (NoC); they achieve high throughput by exploiting the parallel PE array. However, most existing accelerators focus on only one aspect, such as the compute structure of the PE or the data movement overhead over the NoC, which leaves the throughput, area, and latency of the accelerator not fully optimized. In this paper, we propose an efficient general-purpose CNN accelerator that covers both computation, based on a Non-Recovery Compression (NRC) method, and data movement, through a novel Multi-Paths Packet Connection Circuit (MP-PCC). NRC can save computation time due to zero multiplier through shift decoding in the PE and improve power efficiency by saving a large number of data transmissions. MP-PCC, evolved from Packet Connection Circuit, supports single and multicast transmission modes at the same time, and changes the multicast (X, Y) routing algorithm to a multicast Y algorithm to improve transmission efficiency. The proposed architecture, which was implemented on a Xilinx FPGA, achieves 17.7x faster computation speed and 2.2x fewer memory accesses compared with the state-of-the-art method.
3.
  • Du, G., et al. (author)
  • Work-in-progress : SSS: Self-aware system-on-chip using static-dynamic hybrid method
  • 2017
  • In: Proceedings of the 2017 International Conference on Compilers, Architectures and Synthesis for Embedded Systems Companion, CASES 2017. New York, NY, USA: Association for Computing Machinery (ACM). ISBN 9781450351843
  • Conference paper (peer-reviewed). Abstract:
    • Network on chip (NoC) has become the de facto communication standard for multi-core and many-core systems on chip (SoCs), due to its scalability and flexibility. However, temperature is an important factor in NoC design, which affects the overall performance of the SoC by decreasing circuit frequency, increasing energy consumption, and even shortening chip lifetime. In this paper, we propose SSS, a self-aware SoC using a static-dynamic hybrid method, which combines dynamic mapping and static mapping to reduce the hot-spot temperature for NoC-based SoCs. First, we propose monitoring the thermal distribution for self-state sensing. Then, in the static mapping stage, we calculate the optimal mapping solutions under different temperature modes using a discrete firefly algorithm to help self-decision making. Finally, in the dynamic mapping stage, we achieve dynamic mapping by configuring the NoC and the SoC sentient unit for self-optimizing. Experimental results show that SSS can reduce the peak temperature by up to 30.64%. An FPGA prototype shows the effectiveness and smartness of SSS in reducing hot-spot temperature. Keywords: Self-awareness, SoC architecture, NoC.
4.
  • Du, G., et al. (author)
  • Worst-case performance analysis of 2-D mesh NoCs using multi-path minimal routing
  • 2012
  • In: CODES+ISSS'12 - Proceedings of the 10th ACM International Conference on Hardware/Software-Codesign and System Synthesis, Co-located with ESWEEK. New York, NY, USA: ACM Publications. ISBN 9781450314268, pp. 123-132
  • Conference paper (peer-reviewed). Abstract:
    • In Network-on-Chip (NoC), multi-path routing is often preferable to single-path routing since it can better balance workload and thus provide better performance. However, performance analysis with multi-path routing is much more difficult due to complicated contention scenarios. Based on network calculus, we study the worst-case performance of deterministic multi-path minimal routing on 2-D mesh NoCs. We first present a per-flow delay bound analysis technique for multi-path routing, which extends the analysis for single-path routing but deals with traffic splitting. Then we define a contention matrix to capture network congestion status. Based on the contention matrix, we propose an effective non-uniform traffic splitting strategy to improve worst-case performance. Experiments with synthetic traffic flows and an industrial case show that our analysis can effectively explore the traffic splitting space, and verify the effectiveness of the non-uniform splitting policy.
5.
  • Saggio, Alberto, et al. (author)
  • Validating delay bounds in networks on chip : Tightness and pitfalls
  • 2015
  • In: Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI. Institute of Electrical and Electronics Engineers (IEEE), pp. 404-409
  • Conference paper (peer-reviewed). Abstract:
    • Analytical methods for estimating on-chip network performance can be very useful to accelerate and simplify the design process of Networks on Chip (NoCs). However, in order to increase the confidence in these approaches, it is fundamental to perform systematic studies that assess their potential. We present a methodical investigation of the tightness between analytical end-to-end delay bounds and worst-case simulation latencies in various scenarios. We first introduce our network-calculus-based analytical technique to derive per-flow communication delay bounds. Then, we examine the worst-case performance analysis process in NoCs, outlining the major aspects that affect the tightness. Finally, experimental results confirm our deductions and allow us to provide general guidelines to avoid pitfalls in the validation process of analytical delay bounds.
Publication type
conference paper (5)
Content type
peer-reviewed (5)
Author/editor
Gao, M. (3)
Li, Z (2)
Wang, C. (1)
Zhang, D. (1)
Yang, Z. (1)
Zhang, C. (1)
Ma, S. (1)
Li, M. (1)
Yin, Y. (1)
Zhao, Xueqian (1)
Ouyang, Y. (1)
Saggio, A. (1)
Saggio, Alberto (1)
University
Kungliga Tekniska Högskolan (5)
Language
English (5)
Research subject (UKÄ/SCB)
Engineering and Technology (5)
